133 research outputs found

    Link-Prediction Enhanced Consensus Clustering for Complex Networks

    Many real networks that are inferred or collected from data are incomplete due to missing edges. Missing edges can be inherent to the dataset (Facebook friend links will never be complete) or the result of sampling (one may only have access to a portion of the data). The consequence is that downstream analyses that consume the network will often yield less accurate results than if the edges were complete. Community detection algorithms, in particular, often suffer when critical intra-community edges are missing. We propose a novel consensus clustering algorithm to enhance community detection on incomplete networks. Our framework utilizes existing community detection algorithms that process networks imputed by our link-prediction-based algorithm. The framework then merges their multiple outputs into a final consensus output. On average, our method boosts the performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook.
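    The pipeline described in this abstract (impute edges by link prediction, run a community detector on each imputed network, merge the outputs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' algorithm: the common-neighbours score stands in for their link predictor, `detect` is a toy detector, the majority-vote co-association step stands in for their consensus method, and the eight-node graph is invented.

```python
from itertools import combinations

def build_adj(nodes, edges):
    """Adjacency sets for an undirected graph."""
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def components(nodes, edges):
    """Connected components, each returned as a sorted list."""
    adj = build_adj(nodes, edges)
    seen, parts = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, part = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            part.append(u)
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
        parts.append(sorted(part))
    return parts

def impute(nodes, edges, k):
    """Hypothetical link predictor: add the k non-edges with most common neighbours."""
    adj = build_adj(nodes, edges)
    scores = {(u, v): len(adj[u] & adj[v])
              for u, v in combinations(sorted(nodes), 2) if v not in adj[u]}
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    return list(edges) + ranked[:k]

def detect(nodes, edges):
    """Toy detector: drop edges whose endpoints share no neighbour, take components."""
    adj = build_adj(nodes, edges)
    kept = [(u, v) for u, v in edges if adj[u] & adj[v]]
    return components(nodes, kept)

def consensus(nodes, partitions):
    """Link pairs co-clustered in a strict majority of partitions, take components."""
    votes = {}
    for parts in partitions:
        for part in parts:
            for pair in combinations(part, 2):
                votes[pair] = votes.get(pair, 0) + 1
    majority = [p for p, c in votes.items() if 2 * c > len(partitions)]
    return components(nodes, majority)

# Invented example: two 4-node groups, each missing one internal edge, joined by (3, 4).
nodes = list(range(8))
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (5, 6), (5, 7), (6, 7), (3, 4)]

# One partition per imputation level; k=4 over-imputes and merges both groups.
runs = [detect(nodes, impute(nodes, edges, k)) for k in (0, 2, 4)]
result = consensus(nodes, runs)
print(result)
```

    In this toy setting the most aggressive imputation level merges the two groups, while the majority vote across the three runs keeps them apart, which is the kind of smoothing a consensus step is meant to provide.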

    Extracting Inter-community Conflicts in Reddit

    Anti-social behaviors in social media can happen both at user and community levels. While a great deal of attention is on the individual as an 'aggressor,' the banning of entire Reddit subcommunities (i.e., subreddits) demonstrates that this is a multi-layer concern. Existing research on inter-community conflict has largely focused on specific subcommunities or ideological opponents. However, antagonistic behaviors may be more pervasive and integrate into the broader network. In this work, we study the landscape of conflicts among subreddits by deriving higher-level (community) behaviors from the way individuals are sanctioned and rewarded. By constructing a conflict network, we characterize different patterns in subreddit-to-subreddit conflicts as well as communities of 'co-targeted' subreddits. By analyzing the dynamics of these interactions, we also observe that the conflict focus shifts over time. Comment: 21 pages, 7 figures

    Hybrid-search and storage of semi-structured information

    Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 113-118). Given today's tangle of digital information, one of the hardest tasks for computer users of information systems is finding anything in the mess. For a number of well documented reasons including the amazing growth in the Internet's popularity and the drop in the cost of storage, the amount of information on the net as well as on a user's local computer, has increased dramatically in recent years. Although this readily available information should be extremely beneficial for computer users, paradoxically it is now much harder to find anything. Many different solutions have been proposed to the general information seeking task of users, but few if any have addressed the needs of individuals or have leveraged the benefit of single-user interaction. The Haystack project is an attempt to answer the needs of the individual user. Once the user's information is represented in Haystack, the types of questions users may ask are highly varied. In this thesis we will propose a means of representing information in a robust framework within Haystack. Once the information is represented we describe a mechanism by which the diverse questions of the individual can be answered. This novel method functions by using a combination of existing information systems. We will call this combined system a hybrid-search system. By Eytan Adar. M.Eng.

    THE COMPETITIVE DYNAMICS OF WEB SITES

    The phenomenon of electronic commerce has led to a proliferation of web sites competing for the attention and resources of millions of consumers. As has been recently shown, the resulting dynamics is such that a few sites command most of the traffic in the web, a signature of a winner-take-all market [1]. In order to explore the effects of competition among web sites and to determine how they affect the nature of markets, we present a dynamical model of web site growth and competition, which takes into account both the evolution of the probabilities that a person visits sites as well as the existence of links between sites. We show that under general conditions, as the competition between sites increases, the model exhibits a sudden transition from a regime in which many sites thrive simultaneously, to a "winner take all market" in which a few sites grab almost all the users, while most other sites go nearly extinct. This transition is similar to what in ecology is called the Principle of Competitive Exclusion. Furthermore, we study the effect of site linkage on the number of visitors to both competitive and cooperative sites, as well as the implications of our results for linking strategies as practiced on the web. Common link structures found on the web, such as web rings, are also analyzed. Finally, we show web usage data that motivates our work and places the different behaviors in context.
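    The transition this abstract describes, from coexistence to a winner-take-all outcome as competition grows, can be illustrated with a textbook two-species competitive Lotka-Volterra system, the standard ecological setting for competitive exclusion. This is a sketch under assumptions, not the paper's model (which also tracks visit probabilities and links between sites): the equation dx_i/dt = x_i (1 - x_i - gamma x_j), the parameter name `gamma`, the initial shares, and the step size are all chosen here for illustration.

```python
def simulate(gamma, x0=(0.11, 0.10), dt=0.01, steps=20000):
    """Euler-integrate dx_i/dt = x_i * (1 - x_i - gamma * x_j) for two sites.

    x_i is the (normalised) share of visitors of site i and gamma is an
    assumed symmetric competition strength between the two sites.
    """
    a, b = x0
    for _ in range(steps):
        da = a * (1.0 - a - gamma * b)
        db = b * (1.0 - b - gamma * a)
        a, b = a + dt * da, b + dt * db
    return a, b

# Weak competition (gamma < 1): both sites settle near the shared
# equilibrium 1 / (1 + gamma).
weak = simulate(gamma=0.5)

# Strong competition (gamma > 1): the site with a small head start
# takes nearly everything and the other goes nearly extinct.
strong = simulate(gamma=1.5)

print("weak competition:  ", weak)
print("strong competition:", strong)
```

    In this simplified system the change of regime happens as `gamma` crosses 1: below it the coexistence equilibrium is stable, above it any small initial advantage is amplified until one site dominates.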